Robust Value Function Approximation Using Bilinear Programming
نویسندگان
چکیده
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose approximate bilinear programming, a new formulation of value function approximation that provides strong a priori guarantees. In particular, this approach provably finds an approximate value function that minimizes the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We therefore employ and analyze a common approximate algorithm for bilinear programs. The analysis shows that this algorithm offers a convergent generalization of approximate policy iteration. Finally, we demonstrate that the proposed approach can consistently minimize the Bellman residual on a simple benchmark problem.
منابع مشابه
Robust Approximate Bilinear Programming Robust Approximate Bilinear Programming for Value Function Approximation
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms ...
متن کاملRobust Approximate Bilinear Programming for Value Function Approximation
Value function approximation methods have been successfully used in many applications, but the prevailing techniques often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing spe...
متن کاملGlobal Optimization for Value Function Approximation
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms ...
متن کاملA Global Optimization Algorithm for Sum of Linear Ratios Problem
We equivalently transform the sum of linear ratios programming problem into bilinear programming problem, then by using the linear characteristics of convex envelope and concave envelope of double variables product function, linear relaxation programming of the bilinear programming problem is given, which can determine the lower bound of the optimal value of original problem. Therefore, a branc...
متن کاملFunction Approximation Approach for Robust Adaptive Control of Flexible joint Robots
This paper is concerned with the problem of designing a robust adaptive controller for flexible joint robots (FJR). Under the assumption of weak joint elasticity, FJR is firstly modeled and converted into singular perturbation form. The control law consists of a FAT-based adaptive control strategy and a simple correction term. The first term of the controller is used to stability of the slow dy...
متن کامل